SSIS


Microsoft SSIS

(SQL Server Integration Services)

SQL Server Integration Services (SSIS) is an ETL tool. Using SSIS we can create a data transformation service: extract data from various operational sources (Excel, flat files, SQL Server, Oracle, etc.), transform the source business data using the built-in transformations in the staging area, and load and store the result in a destination database or file system.


SSIS Architecture is categorized into two components

1. SSIS Runtime Engine

The SSIS runtime engine completely handles the control flow of the package.

Control flow: The control flow of a package defines the actions that are executed when the package runs. The control flow contains various tasks and containers as well.

Task: A unit of work in a workflow.

For example, Data flow task, execute SQL task, etc.

Container: Container is used to divide the package into multiple blocks.

For example: For Loop container, Foreach Loop container, Sequence container, and Task Host container.

2. Data flow transformation pipeline Engine

The data flow transformation pipeline engine completely handles the data flow of the package. The data flow contains data flow sources (Excel source, flat file source, OLEDB source, etc.), data flow transformations (Conditional Split transformation, Derived Column transformation, Lookup transformation, etc.), and data flow destinations.

Note: Whenever a data flow task occurs in the control flow, the SSIS runtime engine hands control from the control flow to the data flow to run the ETL process while the package is running.
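As a rough analogy (plain Python, not SSIS code), the extract–transform–load hand-off that the two engines coordinate looks like a source feeding transformations feeding a destination; the rows and column names below are invented for illustration:

```python
# Minimal sketch of an extract -> transform -> load pipeline, mirroring
# what the data flow engine does (illustrative only, not the SSIS API).

def extract():
    # "Source": stands in for an Excel / flat file / OLEDB source.
    yield {"Title": "Mr.", "MaritalStatus": "s"}
    yield {"Title": "Ms.", "MaritalStatus": "m"}

def transform(rows):
    # "Transformation": derive an upper-cased copy of a column.
    for row in rows:
        row["MaritalStatusDC"] = row["MaritalStatus"].upper()
        yield row

def load(rows):
    # "Destination": collect rows instead of writing to a database.
    return list(rows)

loaded = load(transform(extract()))
print(loaded[0]["MaritalStatusDC"])  # S
```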

Connection Manager: A logical connection between the SSIS application and a database or file system.

Note: Connection Manager can be established by using various providers in SSIS.

Package: The package is the core component of SSIS. A package can be created through a simple graphical user interface or programmatically.

Data Conversion Transformation

Data conversion transformation is used to convert the data from one data type to another data type, and it also adds a new column to the data set.
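The idea can be sketched in Python: cast a column's value to a new type and append it as a new aliased column, leaving the original in place (the column and alias names here are invented for illustration):

```python
# Sketch of a data conversion: cast a column to a new type and add the
# converted value as a new column (alias), keeping the original column.

def convert_column(rows, column, alias, cast):
    for row in rows:
        row = dict(row)              # don't mutate the input row
        row[alias] = cast(row[column])
        yield row

data = [{"EmployeeID": "101"}, {"EmployeeID": "102"}]
converted = list(convert_column(data, "EmployeeID", "EmployeeID_DC", int))
print(converted[0])  # {'EmployeeID': '101', 'EmployeeID_DC': 101}
```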

Steps to configure data conversion transformation

Start-----> 

All Programmes-----> 

Select Microsoft SQL server 2005-----> 

(to implement all the BI activities)-----> 

Select file menu----->

Select New-----> 

Select Project-----> 

Business intelligence project options under project types-----> 

Integration services project under templates section-----> 

Change the project name as (evening 7:30 Batch)-----> 

Change the project location-----> 

Click ok-----> 

Select package.dtsx in solution explorer and rename it as data conversion.dtsx----->

In control flow, drag and drop the data flow task and rename it as data flow task data conversion-----> 

Select the data flow task and right-click and select the edit option from the right-click popup menu.----->

 In the data flow tab drag and drop OLEDB Source-----> 

Double click on OLEDB Source to edit it.-----> 

In the connection manager page, click new to create a new connection manager.-----> 

Click new-----> Provide server name (localhost (or) server name)-----> 

Select database name (AdventureWorks) from the drop-down list-----> 

Click test connection to evaluate the connection-----> 

Click ok-----> 

Click ok-----> 

Select table or view option from data access mode drop-down list-----> 

Select HumanResources.Employee table name from the drop-down list----->


Select columns from left panel   ------>  

Click ok   ------>  

Drag and drop data conversion transformation and make a connection from OLEDB source to data conversion transformation

Double click on data conversion transformation   ------>  

Check title and marital status from available input columns, change the data type to string (DT_STR), rename the alias columns as Title DC and Marital Status DC, and click ok to save the changes.   ------>  

Drop OLEDB destination from the data section.   ------>  

Make a connection from data conversion transformation to OLEDB destination.  ------>  

Double click On OLEDB destination   ------>  

In the connection manager page, click new to create a new connection manager.   ------>  

New   ------>  

Provide destination server name (localhost) and select the AdventureWorks database from the drop-down list   ------>  Click test connection   ------>  

Click ok   ------>  

Click ok   ------>  

Click new to create the destination table, remove the copy of title and copy of marital status columns, and rename the table as converted data.   ------>  

 Click ok   ------>  

Select the mappings option from the left panel to make a mapping between input or source columns and destination columns

Steps to execute SSIS package

In Business Intelligence Development Studio, press Ctrl+Alt+L to open solution explorer, and select the data conversion.dtsx package

Right-click and select execute package

Derived column transformation:

The derived column transformation enables in-line transformations using SSIS expressions to transform the data. The typical use of the derived column transformation is to create or derive new columns by using the existing source columns, variables, or available functions.
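As a rough Python analogy (the helper and column names are invented; in SSIS the expression would be something like (DT_DBDATE)GETDATE()), deriving a column means evaluating an expression per row and attaching the result under a new name:

```python
import datetime

# Sketch of a derived column: add a RefDate column computed from a
# per-row expression (here, today's date), alongside existing columns.

def derive(rows, name, expression):
    for row in rows:
        row = dict(row)
        row[name] = expression(row)
        yield row

today = datetime.date.today()
rows = [{"EmployeeID": 1}, {"EmployeeID": 2}]
derived = list(derive(rows, "RefDate", lambda r: today))
print(derived[0])
```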

Steps to configure Derived Column transformation:

Start   ------>  

All programs ------>  

Microsoft SQL Server 2005   ------>  

Select the Microsoft business intelligence development studio.   ------>  

Select file menu   ------>  

Select new   ------>  

Projects   ------>  

Select business Intelligence Projects option   ------>  

Integration Services project   ------>  

Change project location and name   ------>  

Click ok   ------>  

In the Business Intelligence Development Studio, in the control flow drag and drop the data flow task and rename it as data flow task derived column.   ------>  

Double click on the data flow task to edit it   ------>  

In the data flow, drag and drop OLEDB source   ------>  

Double click on OLEDB source to configure it   ------>  

Select click new to create source connection manager.   ------>  

Click new   ------>  

In the connection manager editor provide server name (localhost (or) server name)   ------>  

Select adventure works from the drop down list   ------>  

Click ok twice   ------>  

Select HumanResources.EmployeeAddress table from the drop-down list   ------>  

Select columns from left panel   ------>  

Click ok   ------>  

Open SSMS (SQL Server Management Studio) and run the following query to create destination table.   ------>  

CREATE TABLE [Derived Column](
[Employee ID] INT,
[Address ID] INT,
[Row guid] UNIQUEIDENTIFIER,
[Modified Date] DATETIME,
[Ref Date] DATETIME)

Go to Business Intelligence development studio and drag and  drop derived column transformation and make a connection from OLEDB source to derived column using a green data flow path.   ------>  

Double click on derived column   ------>  

Define the following expression

Derived column name        Expression              Data type
Ref Date                   (DT_DBDATE)GETDATE()    DT_DBDATE

Note: The above expression is defined by dragging and dropping the GETDATE() function from the date-time functions section and renaming Derived Column 1 as Ref Date; the same will be carried forward to the destination in our scenario.

Execute SQL Task:

Execute SQL Task is used to execute relational queries, such as DDL and DML statements, against a connection.

Basic Parameters or Properties of Execute SQL Task

Connection: In execute SQL task, the connection is nothing but a connection manager. Follow these steps to create a connection manager in execute SQL task.

Open Business Intelligence Development studio   ------>  

Create a new package and rename it as execute SQL .dtsx   ------>  

In control flow drag and drop execute SQL task on to design area   ------>  

Double click on execute SQL task to edit it   ------>  

Provide the following steps   ------>  

Select new connection   ------>  

Click new   ------>  

Provide server name (localhost (or) server name)   ------>  

Select the AdventureWorks database from the drop-down list.   ------>  

Click test connection to evaluate the connection between the database and SSIS   ------>  

Click ok   ------>  

Click ok   ------>  

SQL source type – Select direct Input (default)

Note: here, we have 3 options

  1. Direct input – The user can enter the SQL command directly in the editor provided
  2. File connection – We can pass the query through a file system by providing the path and file name
  3. Variable – We can pass the query through a variable which is already declared in the SSIS variables section
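Outside SSIS, the same "run a SQL statement against a connection" step looks like this in Python with the standard-library sqlite3 module (the table name is made up; SQLite uses DELETE FROM rather than TRUNCATE TABLE):

```python
import sqlite3

# Sketch of what Execute SQL Task does: open a connection and run a
# DDL/DML statement against it (an in-memory SQLite database here).

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tested_data (id INTEGER)")
conn.executemany("INSERT INTO tested_data VALUES (?)", [(1,), (2,)])

# Equivalent of the tutorial's "TRUNCATE TABLE <name>" statement:
conn.execute("DELETE FROM tested_data")
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM tested_data").fetchone()[0]
print(count)  # 0
```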

SQL Statement:

TRUNCATE TABLE [any valid table name]

Click ok to save the changes   ------>  

In solution explorer (Ctrl+Alt+L), select execute SQL.dtsx   ------>  

Right-click and select execute package option (2nd option)

Execute package task:

The execute package task is used to execute a package with in the parent package.

Steps To Configure Execute Package Task

Open Business Intelligence Development Studio; in solution explorer create a new package and rename it as exec pkg.dtsx.   ------>  

In control flow, drag and drop execute package task   ------>  

Rename the execute package task as EPT calling exec SQL package   ------>  

Double click on execute package task to configure it   ------>  

Select package option from left pane and set the following properties   ------>  

Location – select file system   ------>  

Connection – select new connection   ------>  

Click browse   ------>  

Navigate to the path where the package is stored in,   ------>  

Select execute SQL.dtsx   ------>  

Click open   ------>  

Click ok   ------>  


In solution explorer, select exec pkg.dtsx   ------>  

Right click and select execute package option.   ------>  

The linked package will be automatically opened by the parent package and then executed.

 

Variables:

In SSIS, variables are categorized into two types:

  1. System variables
  2. User-Defined Variables

System Variables:

System variables are built-in variables, and they can be accessed throughout the package.

For example: System::CreationName

System::PackageName

System::TaskID etc.

Note: System variables can be identified by the system scope resolution operator (::). That means all system variables start with System::

User-Defined variable: 

A user-defined variable can be created by the developer, and a user-defined variable can have its own name, data type, value, and scope as well.

Note: User-Defined variables can be identified by USER:: variable name

  1. If we create a variable with respect to a package, the scope of that variable is the complete package.
  2. If we create a variable with respect to a container, that variable can be accessed in the entire container.
  3. If we create a variable with respect to a task, the scope of the variable is within the specified task only.

Example: Package for excel source

Open Business intelligence development studio   ------>  

Create new package and rename it as excel source .dtsx   ------>  

In control flow drag and drop data flow task   ------>  

Double click on the data flow task to configure it   ------>  

In the Data flow, drag and drop excel source   ------>  

Double click on excel source to configure it   ------>  

Click  new to create a connection manager for excel   ------>  

Click browse, select the alliance details Excel file, and click open   ------>  

Ensure that the first row has column names checkbox is checked   ------>  

Click ok   ------>  

Select data access  mode as table or view   ------>  

Select sheet 1 from drop-down list   ------>  

Select columns   ------>  

Click ok

Note: Prepare the following excel file i.e already linked or connected to excel source

Source code    SBAT Type    Partner type    Funded Amount
81818          MSA          Builder         50000
81540          B1           Realtor         40000
12345          MAP          Realtor         9000

Merge Transformation:

The Merge transformation combines two sorted data sets into a single output based on values in their key columns. This transformation requires that the inputs or sources are sorted, and the merged columns must have the same data type.
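Because both inputs must already be sorted on the key column, the merge itself is a single ordered pass over the two inputs, as in this Python sketch using the standard-library heapq.merge (column names are invented):

```python
import heapq

# Sketch of the Merge transformation: combine two inputs that are
# already sorted on the key column into one sorted output.

source1 = [{"EmployeeID": 1}, {"EmployeeID": 3}]
source2 = [{"EmployeeID": 2}, {"EmployeeID": 4}]

merged = list(heapq.merge(source1, source2, key=lambda r: r["EmployeeID"]))
print([r["EmployeeID"] for r in merged])  # [1, 2, 3, 4]
```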

Steps to configure merge Transformation

Open business intelligence development studio   ------>  

In the integration services project, create a new package and rename it as merge .dtsx   ------>  

In control flow drag and drop data flow task and rename it as data flow task merge.   ------>  

In data flow drag and drop OLEDB source rename it as source 1   ------>  

Double click on source 1 to configure it.   ------>  

Provide the connection manager if it exists   ------>  

Select HumanResources.Employee table from the drop-down list   ------>  

Select columns from the left pane and click ok to save changes   ------>  

Right-click on source 1 and select show advanced editor and set the following properties,   ------>  

Select the input and output properties tab,   ------>  

Select OLEDB source output and set IsSorted – true,   ------>  

Expand OLEDB source output and also expand output columns.   ------>  

Select 1st column through which we are going to make a mapping between two sources (employee ID) and set,   ------>  

Sort key position – 1

Click Refresh   ------>  

Click ok   ------>  

Drag and drop another OLEDB source and rename it as source 2   ------>  

Double click on source 2 provide connection manager if exists   ------>  

Select HumanResources.EmployeeAddress from the drop-down list   ------>  

Select columns   ------>  

Click ok   ------>  

Right-click on source 2 select advanced editor to sort the data and set the following properties   ------>  

Select the input and output properties tab,   ------>  

Select OLEDB source output and set,   ------>  

IsSorted – true   ------>  

Expand OLEDB source output and also expand output columns,   ------>  

Select the 1st column through which we are going to make a mapping between two sources (employee ID) and set,   ------>  

Sort key position – 1   ------>  

Click refresh   ------>  

Click ok   ------>  

Drag and drop merge transformation   ------>  

Make a connection from source 1 to merge and select merge input 1 option in input, output selection editor   ------>  

Click ok   ------>  

Make a connection from source 2 to merge   ------>  

Double click on merge to make sure that all columns are mapped.   ------>  

Drag and drop OLEDB destination, make a connection from merge to OLEDB destination   ------>  

Double click on OLEDB destination provide destination connection manager and click new to create destination table and rename the table as merged data   ------>  

Click ok   ------>  

Select mappings   ------>  

Click ok   ------>  

In solution explorer select the package and select execute package   ------>  

Merge Join Transformation:

The merge join transformation combines two sorted data sets into a single output using an inner join (the default), left outer join, or full outer join.
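A left outer join keeps every row from the left input and attaches right-side columns only where the key matches, roughly as in this Python sketch (table and column names are invented):

```python
# Sketch of a left outer merge join on EmployeeID: every row from the
# left input survives; right-side columns are filled in where keys match.

left = [{"EmployeeID": 1, "Title": "Mr."}, {"EmployeeID": 2, "Title": "Ms."}]
right = [{"EmployeeID": 2, "City": "Seattle"}]

right_by_key = {r["EmployeeID"]: r for r in right}

joined = []
for row in left:
    match = right_by_key.get(row["EmployeeID"], {})
    joined.append({**row, "City": match.get("City")})

print(joined)
```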


Steps to configure merge join Transformation:

Configure 2 OLEDB sources (Source 1, source 2) from the previous example

Drag and drop merge join transformation   ------>  

Make a connection from source 1 to merge join and select merge join left input option from input-output selection editor   ------>  

Make a connection from source 2 to merge join transformation   ------>  

Double click on merge join and select left outer join as join type and select the following columns from both source 1 and source 2   ------>  

Click ok   ------>  

Drag and drop OLEDB destination and make a connection from merge join to destination.   ------>  

Double click on OLEDB destination   ------>  

provide destination connection manager if exists and click new to create the destination table.   ------>  

Rename the OLEDB destination as merge join data.   ------>  

Click ok   ------>  

Execute package

Union All transformation: It combines multiple inputs into a single output. It differs from the merge and merge join transformations because union all doesn't require sorted inputs. However, the first input is the reference input, and all subsequent inputs must match it on the following criteria:

  1. Number of columns
  2. Data type
  3. Length
  4. Precision (Decimal point)
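In effect, Union All concatenates its inputs after checking that each input's columns line up with the first (reference) input; a Python sketch of that check (comparing only column names here, not data types or lengths):

```python
# Sketch of Union All: no sorting needed; inputs are concatenated, but
# every input's columns must match the first (reference) input.

def union_all(*inputs):
    reference = set(inputs[0][0])        # column names of the first input
    for rows in inputs[1:]:
        if set(rows[0]) != reference:
            raise ValueError("input columns do not match the reference input")
    return [row for rows in inputs for row in rows]

a = [{"id": 1}, {"id": 2}]
b = [{"id": 3}]
print(union_all(a, b))  # [{'id': 1}, {'id': 2}, {'id': 3}]
```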

Conditional split transformation :

It routes the input data to different outputs based on case conditions. If no case (or) condition is met, the data is routed to the default output. The implementation of a conditional split is similar to a case decision structure (switch case) in general programming languages.
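The switch-case analogy can be sketched in Python: each row is tested against the conditions in order and sent to the first matching output, or to the default output when none match (function and output names are invented):

```python
# Sketch of Conditional Split: route each row to the first output whose
# condition it satisfies, or to the default output otherwise.

def conditional_split(rows, cases, default="default"):
    outputs = {name: [] for name, _ in cases}
    outputs[default] = []
    for row in rows:
        for name, condition in cases:
            if condition(row):
                outputs[name].append(row)
                break
        else:
            outputs[default].append(row)
    return outputs

rows = [{"src": 5, "dst": 5}, {"src": 5, "dst": 3}]
result = conditional_split(
    rows, [("counts_equal", lambda r: r["src"] == r["dst"])]
)
print(len(result["counts_equal"]), len(result["default"]))  # 1 1
```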

Scenario:

To test whether the package executed successfully (or) failed, using conditional split, derived column transformation, and union all.

Open Business intelligence development studio.   ------>  

Create a new package and rename it as test.dtsx   ------>  

In the control flow, drag, and drop the data flow task.   ------>  

In the data flow, drag and drop OLEDB source   ------>  

Double click on OLEDB source to configure it.   ------>  

Provide connection manager if exists   ------>  

Select HumanResources.Employee table from the drop-down list.   ------>  

Select columns from left panel   ------>  

Click ok   ------>  

Drag and drop row count transformation and rename it as RC src count   ------>  

Make a connection from SRC to row count   ------>  

Double click on Row count to edit it.   ------>  

In component properties tab, provide the following property   ------>  

Custom properties   ------>  

Variable name – UVSrcCount   ------>  

Click refresh and click ok   ------>  

Drag and drop derived column transformation and make a connection from Row count to derived column   ------>  

Double click on derived column to define the execution date and provide the following expression: Execution date – GETDATE()   ------>  

Click ok   ------>  

Drag and drop row count transformation and rename as RC Dest count   ------>  

Make a connection from the derived column to Row count.   ------>  

Double click on Row count and provide the following property   ------>  

Custom properties   ------>  

Variable name – UVDstCount   ------>  

Click refresh   ------>  

Click ok   ------>  

Drag and drop OLEDB destination and make a connection from row count to destination   ------>  

Double click on a destination to configure it.   ------>  

Provide destination connection manager if exists.   ------>  

Click new to create a destination table if it does not exist and rename the destination table as tested data.   ------>  

Click ok   ------>  

Select mappings   ------>  

Click ok   ------>  

Note: Define the following variables in the control flow with package scope.

Name              Data Type    Value
UVSrcCount        Int32
UVDstCount        Int32
UVSolutionName    String       Morning 8:30 batch
UVTableName       String       Tested Data

In control flow, drag and drop data flow task   ------>  

Rename it as Data flow task test condition   ------>  

Double click on data flow task   ------>  

In the data flow, drag and drop OLEDB source   ------>  

Double click on OLEDB src to configure it   ------>  

Provide a src connection manager if it exists and set the following properties.   ------>  

Data access mode – select SQL command; Command text – provide the following query to fetch the execution date,   ------>  

SELECT DISTINCT GETDATE() AS [Execution Date] FROM [tested data]   ------>  

Drag and drop derived column transformation to derive the following columns using the existing variables.   ------>  

Make a connection from OLEDB src to the derived column.   ------>  

Double click on derived column   ------>  

Solution name – @[User::UVSolutionName]   ------>  

Package name – @[System::PackageName]   ------>  

Table name – @[User::UVTableName]   ------>  

Source count – @[User::UVSrcCount]   ------>  

Destination count – @[User::UVDstCount]   ------>  

Click ok   ------>  

Drag and drop, conditional split transformation to check the condition   ------>  

Make a connection from derived column to conditional split.   ------>  

In condition split transformation editor, provide the following condition.   ------>  

Output Name        Condition
Case 1             [Source Count] == [Destination Count]

Rename case 1 as src count is equal to Dst count   ------>  

Rename conditional split default output as src count is not equal to Dst count   ------>  

Click ok   ------>  

Drag and drop derived column transformation and make a connection from conditional split to the derived column.   ------>  

Select src count is equal to dst count from input/output editor.   ------>  

Rename the derived column 1 as success status.   ------>  

Double click on the derived column and derive the following expression    ------>  

Derived column name                                                                        Expression

Status                                                                                      “success”

Click ok    ------>  

Drag and drop derived column transformation and rename it as failure status.    ------>  

Make a connection from the conditional split to the derived column.    ------>  

Double click on failure status to define the status.    ------>  

Derived column name                                                                                    Expression

Status                                                                                                          “Failure”

Click ok    ------>  

Drag and drop union all transformation and make a connection from success status to union all and also make a connection from failure to union all.    ------>  

Drag and drop OLEDB destination to capture the log information    ------>  

Make a connection from union all to destination    ------>  

Double click on a destination to configure it.    ------>  

Provide destination connection manager if it exists    ------>  

Click new to create new destination table and rename it as SSIS_Log    ------>  

Click ok    ------>  

Select mappings       ------> 

Click ok    ------>  

Execute package    ------>  


About Author

TekSlate